The diamonds dataset records the four "C"s of diamond quality (carat weight, cut, color, and clarity), the price, and five physical measurements (depth, table, x, y, z).
## carat cut color clarity depth table price x y z
## 16011 1.33 Premium H SI2 62.8 52.0 6405 7.15 7.06 4.46
## 42666 0.56 Ideal F SI1 61.5 55.0 1334 5.30 5.34 3.27
## 22048 0.31 Ideal I VS1 62.8 57.0 628 4.32 4.28 2.70
## 53655 0.76 Ideal D SI2 62.2 57.0 2706 5.85 5.83 3.63
## 52046 0.55 Very Good G VVS1 61.5 55.1 2451 5.26 5.28 3.24
## 28559 0.30 Ideal H VS1 62.1 54.0 675 4.35 4.32 2.69
## 26581 1.66 Very Good F VS2 61.4 58.0 16294 7.63 7.68 4.70
## 20715 0.31 Ideal F VS2 61.9 54.0 625 4.35 4.38 2.70
## 44985 0.31 Ideal G SI2 61.7 56.0 523 4.38 4.34 2.69
## 8184 1.02 Ideal G SI2 62.0 55.0 4366 6.40 6.51 4.00
Draw a scatterplot showing the relationship between diamond price and weight.
library(ggplot2)  # provides qplot() and the diamonds data
qplot(carat,price,data=diamonds)
Functions of variables, such as log(), can also be passed as arguments.
qplot(log(carat),log(price),data = diamonds)
The relationship between a diamond's volume (approximated by x*y*z) and its weight:
qplot(carat,x*y*z,data = diamonds)
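Because qplot() evaluates expressions such as x*y*z inside the data frame, the same quantity can be computed up front and inspected numerically. The following is a minimal sketch; the names `volume` and `r` are illustrative and do not appear in the original.

```r
library(ggplot2)  # for the diamonds data

# Precompute the same expression that qplot() evaluates inside `data`.
volume <- with(diamonds, x * y * z)   # rough volume proxy (x, y, z are lengths in mm)

# Pearson correlation between weight and the volume proxy.
r <- cor(diamonds$carat, volume)
r
```

A value close to 1 would support reading the scatterplot as a nearly linear relationship, apart from a few outlying points.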
Add color and cut information to the scatterplot of weight versus price.
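dsmall <- diamonds[sample(nrow(diamonds),100),]  # assumed: a random 100-row sample, like dsmall1 further down; the original does not show this definition
qplot(carat,price,data = dsmall,colour = color)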
qplot(carat,price,data = dsmall,shape = cut)
Use the alpha aesthetic, which ranges from 0 (fully transparent) to 1 (fully opaque).
qplot(carat,price,data = diamonds,alpha = I(1/10))
qplot(carat,price,data = diamonds,alpha = I(1/100))
qplot(carat,price,data = diamonds,alpha = I(1/200))
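To get a feel for why smaller alpha values reveal where the bulk of the points lie, it can help to compute how opaque a stack of overlapping points becomes. This is a rough sketch only: the helper name `stacked_opacity` is hypothetical, and it assumes the graphics device uses standard "over" compositing, which the original text does not state.

```r
# Combined opacity of k overlapping points that each have transparency `alpha`,
# assuming standard "over" compositing (an assumption about the graphics device).
stacked_opacity <- function(alpha, k) 1 - (1 - alpha)^k

# Opacity reached by 1, 10, and 50 overlapping points.
stacked_opacity(1/10, c(1, 10, 50))
stacked_opacity(1/100, c(1, 10, 50))
stacked_opacity(1/200, c(1, 10, 50))
```

Under that assumption, the smaller the alpha, the more points must overlap before a region looks solid, so dense regions stand out from sparse ones.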
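Geometric objects (geoms) describe what kind of object should be used to display the data, and some geoms have an associated statistical transformation. By changing the geom argument, qplot() can draw almost any type of plot.
Two variables
- geom = "point" draws a scatterplot.
- geom = "smooth" fits a smooth curve to the data.
- geom = "boxplot" draws a box-and-whisker plot.
- geom = "path" and geom = "line" draw lines between the data points.
One continuous variable
- geom = "histogram" draws a histogram.
- geom = "freqpoly" draws a frequency polygon.
- geom = "density" draws a density curve.
One discrete variable
- geom = "bar" draws a bar chart.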
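The geom names listed above correspond to layer functions in the full ggplot2 grammar. The sketch below shows a few of them written with ggplot() and geom_*() instead of qplot(); the sample `d` and the object names are illustrative, not from the original.

```r
library(ggplot2)

set.seed(1)                                   # make the sample reproducible
d <- diamonds[sample(nrow(diamonds), 100), ]  # small sample, illustrative only

# geom = "point"     -> geom_point()
p_point <- ggplot(d, aes(carat, price)) + geom_point()
# geom = "boxplot"   -> geom_boxplot()
p_box <- ggplot(d, aes(color, price / carat)) + geom_boxplot()
# geom = "histogram" -> geom_histogram()
p_hist <- ggplot(d, aes(carat)) + geom_histogram(binwidth = 0.1)
# geom = "bar"       -> geom_bar()
p_bar <- ggplot(d, aes(color)) + geom_bar()
```

Printing any of these objects, for example `print(p_point)`, renders the plot.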
dsmall1 <- diamonds[sample(nrow(diamonds),100),]
qplot(carat,price,data = dsmall1,geom = c("point","smooth"))
## geom_smooth: method="auto" and size of largest group is <1000, so using loess. Use 'method = x' to change the smoothing method.
qplot(carat,price,data = diamonds,geom = c("point","smooth"))
## geom_smooth: method="auto" and size of largest group is >=1000, so using gam with formula: y ~ s(x, bs = "cs"). Use 'method = x' to change the smoothing method.
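- Use the method argument to choose a different smoother:
method = "loess", the default when n is small, uses local regression. See ?loess for more details on the algorithm. The wiggliness of the curve is controlled by the span argument, which ranges from 0 (very wiggly) to 1 (very smooth).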
qplot(carat,price,data = dsmall1,geom = c("point","smooth"),span =0.2)
## geom_smooth: method="auto" and size of largest group is <1000, so using loess. Use 'method = x' to change the smoothing method.
qplot(carat,price,data = dsmall1,geom = c("point","smooth"),span=1)
## geom_smooth: method="auto" and size of largest group is <1000, so using loess. Use 'method = x' to change the smoothing method.
Loess is not well suited to large datasets (its memory use is O(n^2)), so a different smoothing algorithm is used by default when n is 1000 or more.
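The effect of span can also be checked outside of a plot with base R's loess(). The sketch below uses synthetic data rather than diamonds so that it runs without ggplot2; the data and the object names are assumptions made for illustration.

```r
set.seed(42)
x <- seq(0, 10, length.out = 200)
y <- sin(x) + rnorm(200, sd = 0.3)   # synthetic noisy curve, illustrative only

fit_wiggly <- loess(y ~ x, span = 0.2)  # small span: follows the data closely
fit_smooth <- loess(y ~ x, span = 1)    # large span: much smoother fit

# Residual sum of squares on the training data for each fit.
rss <- c(wiggly = sum(residuals(fit_wiggly)^2),
         smooth = sum(residuals(fit_smooth)^2))
rss
```

A lower residual sum of squares for the small span reflects a curve that tracks the points more tightly, which is what the span = 0.2 plot shows visually.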
qplot(color,price/carat,data = diamonds,geom = "jitter",alpha = I(1/5))
qplot(color,price/carat,data = diamonds,geom = "jitter",alpha = I(1/50))
qplot(color,price/carat,data = diamonds,geom = "jitter",alpha = I(1/200))
qplot(carat,data = diamonds,geom = "histogram")
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
qplot(carat,data = diamonds,geom = "density")
For histograms, the binwidth argument sets the bin width and thereby controls the degree of smoothing.
qplot(carat,data = diamonds,geom = "histogram",binwidth = 1,xlim = c(0,3))
qplot(carat,data = diamonds,geom = "histogram",binwidth = 0.1,xlim = c(0,3))
qplot(carat,data = diamonds,geom = "histogram",binwidth = 0.01,xlim = c(0,3))
## Warning: position_stack requires constant width: output may be incorrect
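To see how binwidth changes the granularity, the sketch below computes the default width reported in the message (range/30) and counts how many bins each of the three widths produces on [0, 3]. It uses base R's cut() as a stand-in, so the bin edges are an approximation and not ggplot2's exact binning; `bin_counts` is a hypothetical helper.

```r
library(ggplot2)  # for the diamonds data

carat <- diamonds$carat

# Default width used when binwidth is not supplied: range / 30.
default_binwidth <- diff(range(carat)) / 30
default_binwidth

# Count observations per bin on [0, 3] for a given width
# (base R binning, not ggplot2's exact algorithm).
bin_counts <- function(width) {
  breaks <- seq(0, 3, by = width)
  table(cut(carat[carat <= 3], breaks = breaks))
}

# Number of bins for the three widths used in the plots.
n_bins <- sapply(c(1, 0.1, 0.01), function(w) length(bin_counts(w)))
n_bins
```

Very wide bins hide structure, while very narrow bins expose fine detail at the cost of a noisier picture.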
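When a categorical variable is mapped to an aesthetic, the geom is automatically split by that variable, so the following commands tell qplot() to draw a density curve and a histogram for each diamond color.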
qplot(carat,data = diamonds,geom = "density",colour = color)
qplot(carat,data = diamonds,geom = "histogram",fill = color)
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
qplot(color,data=diamonds,geom = "bar")
qplot(color,data = diamonds,geom = "bar",weight = carat) + scale_y_continuous("carat")
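With the weight argument, each bar shows the sum of the weighting variable for that category instead of the number of rows. A minimal check of that reading, using illustrative object names that are not in the original:

```r
library(ggplot2)

# Total carat weight per color, computed directly.
carat_by_color <- tapply(diamonds$carat, diamonds$color, sum)

# The same bar chart written with ggplot(); layer_data() exposes the bar heights.
p <- ggplot(diamonds, aes(color, weight = carat)) + geom_bar()
bar_heights <- layer_data(p)$count

carat_by_color
bar_heights
```

The two vectors should agree, one value per color from D to J.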
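A line plot connects points from left to right, whereas a path plot connects them in the order they appear in the dataset (a line plot is equivalent to sorting the data by x value and then drawing a path plot). The x axis of a line plot is usually time, showing how a single variable changes over time. A path plot shows how two variables change together over time, with time encoded in the order of the points.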
qplot(date,unemploy/pop,data = economics,geom = "line")
qplot(date,uempmed,data = economics,geom = "line")
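The equivalence between a line plot and a path plot on sorted data can be sketched by shuffling the rows of economics first. The object names below are illustrative, and the plots are written with ggplot() so that layer_data() can be used to inspect what each geom draws.

```r
library(ggplot2)

set.seed(1)
shuffled <- economics[sample(nrow(economics)), ]  # rows in random order

# A line plot sorts by x internally, so row order does not matter.
p_line <- ggplot(shuffled, aes(date, uempmed)) + geom_line()

# A path plot follows row order, so sort by date first to get the same curve.
sorted <- shuffled[order(shuffled$date), ]
p_path_sorted <- ggplot(sorted, aes(date, uempmed)) + geom_path()

line_xy <- layer_data(p_line)[, c("x", "y")]
path_xy <- layer_data(p_path_sorted)[, c("x", "y")]
```

Drawing `geom_path()` on the unsorted `shuffled` data instead would produce a tangle of segments, because consecutive rows are no longer consecutive months.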
qplot(carat,data = diamonds,facets = color ~ .,geom = "histogram",binwidth = 0.1,xlim = c(0,3))
qplot(carat,..density..,data = diamonds,facets = color ~.,geom = "histogram",binwidth = 0.1,xlim = c(0,3))
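The two faceted histograms differ only in the y axis: counts in the first, densities in the second, which makes panels with very different sample sizes comparable. The sketch below writes the density version with ggplot(), facet_grid(), and after_stat(), assuming ggplot2 3.3.0 or later; it omits xlim so that every bin is kept, and checks that the bar areas in each panel sum to 1.

```r
library(ggplot2)

# Density histogram of carat, one panel per color (rows), as with facets = color ~ .
p <- ggplot(diamonds, aes(carat, after_stat(density))) +
  geom_histogram(binwidth = 0.1) +
  facet_grid(color ~ .)

# Area of the bars in each panel: density times bin width, summed per panel.
bars <- layer_data(p)
area_per_panel <- tapply(bars$density * (bars$xmax - bars$xmin), bars$PANEL, sum)
area_per_panel
```

Each panel integrating to 1 is what lets the shapes of the distributions be compared directly across colors.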